An Efficient Algorithm for Identifying the Most Contributory Substring
Abstract
Detecting repeated portions of strings has important applications in many areas of study, including data compression and computational biology. This paper defines the Most Contributory Substring Problem, which asks for the single substring that represents the largest proportion of the characters within a set of strings, and presents a solution to it. We show that the problem can be solved in O(n) time (where n is the total number of characters in all of the input strings) when overlapping occurrences of the most contributory substring are permitted. Furthermore, we present an extended algorithm that does not permit occurrences of the most contributory substring to overlap. The expected running time of the extended algorithm is O(n log n), while its worst-case running time is O(n^2).
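To make the problem statement concrete, here is a minimal brute-force sketch in Python. It is not the paper's algorithm and does not achieve the running times quoted above; it only illustrates what is being computed. It assumes that the "contribution" of a substring is its length multiplied by its total number of occurrences across the input strings, which is one plausible reading of "largest proportion of the characters". The function names and the `allow_overlap` flag are hypothetical.

```python
def count_overlapping(text, pattern):
    """Count occurrences of pattern in text, allowing overlaps."""
    count = 0
    start = text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + 1)
    return count


def most_contributory_substring(strings, allow_overlap=True):
    """Brute-force reference: try every substring of every input string.

    Assumption (not taken from the paper): contribution is scored as
    len(substring) * total occurrences across all input strings.
    Ties are broken by lexicographic order of the candidates.
    """
    candidates = set()
    for s in strings:
        for i in range(len(s)):
            for j in range(i + 1, len(s) + 1):
                candidates.add(s[i:j])

    best, best_score = "", 0
    for sub in sorted(candidates):
        if allow_overlap:
            occurrences = sum(count_overlapping(s, sub) for s in strings)
        else:
            # str.count scans left to right and counts non-overlapping matches
            occurrences = sum(s.count(sub) for s in strings)
        score = len(sub) * occurrences
        if score > best_score:
            best, best_score = sub, score
    return best, best_score
```

For the inputs `["abcabc", "abcx"]`, the substring `"abc"` occurs three times, so under the assumed scoring its contribution is 9 in both the overlapping and non-overlapping variants. For `["aaaa"]`, the two variants differ: with overlaps, `"aa"` is counted three times. Note that with overlaps permitted, this score can count the same character more than once.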
Similar resources
Identifying Rhythms in Musical Texts
A fundamental problem in music is to classify songs according to their rhythm. A rhythm is represented by a sequence of “Quick” (Q) and “Slow” (S) symbols, which correspond to the (relative) duration of notes, such that S = 2Q. In this paper, we present an efficient algorithm for locating the maximum-length substring of a music text t that can be covered by a given rhythm r.
Identifying and Evaluating Effective Factors in Green Supplier Selection using Association Rules Analysis
Nowadays, companies evaluate suppliers on the basis of a variety of factors and criteria that affect supplier selection. This paper aims to identify the key criteria for selecting green suppliers through an efficient algorithm called iterative process mining, or i-PM. Green data were collected first by reviewing previous studies to identify various environmental cri...
The Efficient Computation of Complete and Concise Substring Scales with Suffix Trees
Strings are an important part of most real application multivalued contexts. Their conceptual treatment requires the definition of substring scales, i.e., sets of relevant substrings, so as to form informative concepts. However, these scales are either defined by hand or derived in a context-unaware manner (e.g., all words occurring in string values). We present an efficient algorithm based on s...
Generalized Substring Compression
In substring compression one is given a text to preprocess so that, upon request, a compressed substring is returned. Generalized substring compression is the same with the following twist. The queries contain an additional context substring (or a collection of context substrings) and the answers are the substring in compressed format, where the context substring is used to make the compression...
An Energy-efficient Mathematical Model for the Resource-constrained Project Scheduling Problem: An Evolutionary Algorithm
In this paper, we propose an energy-efficient mathematical model for the resource-constrained project scheduling problem to optimize makespan and consumption of energy, simultaneously. In the proposed model, resources are speed-scaling machines. The problem is NP-hard in the strong sense. Therefore, a multi-objective fruit fly optimization algorithm (MOFOA) is developed. The MOFOA uses the VIKO...
Publication date: 2007